Two stationary nonhomogeneous Markov models of nucleotide sequence evolution.
Abstract
The general Markov model (GMM) of nucleotide substitution does not assume the evolutionary process to be stationary, reversible, or homogeneous. The GMM can be simplified by assuming the evolutionary process to be stationary. A stationary GMM is appropriate for analyses of phylogenetic data sets that are compositionally homogeneous; a data set is considered to be compositionally homogeneous if a statistical test does not detect significant differences in the marginal distributions of the sequences. Though the general time-reversible (GTR) model assumes stationarity, it also assumes reversibility and homogeneity. We propose two new stationary and nonhomogeneous models--one constrains the GMM to be reversible, whereas the other does not. The two models, coupled with the GTR model, comprise a set of nested models that can be used to test the assumptions of reversibility and homogeneity for stationary processes. The two models are extended to incorporate invariable sites and used to analyze a seven-taxon hominoid data set that displays compositional homogeneity. We show that within the class of stationary models, a nonhomogeneous model fits the hominoid data better than the GTR model. We note that if one considers a wider set of models that are not constrained to be stationary, then an even better fit can be obtained for the hominoid data. However, the methods for reducing model complexity from an extremely large set of nonstationary models are yet to be developed.
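The distinction the abstract draws between stationarity and reversibility can be illustrated for a single 4-state rate matrix. The sketch below is not the authors' parameterization. The function names, the frequencies `pi`, the exchangeabilities `s`, and the flux size `c` are all hypothetical values chosen for illustration. It builds a GTR-style matrix, which satisfies both conditions, and then adds a small circulating flux that keeps `pi` stationary while breaking detailed balance. It does not model nonhomogeneity across branches.

```python
# Sketch: stationarity and reversibility checks for a 4-state rate matrix.
# All names and numeric values are illustrative assumptions, not from the paper.

def gtr_rate_matrix(pi, s):
    """GTR-style matrix: Q[i][j] = s[i][j] * pi[j] for i != j, with s symmetric.
    Diagonal entries are set so that every row sums to zero."""
    n = len(pi)
    q = [[s[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q

def is_stationary(pi, q, tol=1e-12):
    """pi is stationary for Q when pi * Q = 0 in every column."""
    n = len(pi)
    return all(abs(sum(pi[i] * q[i][j] for i in range(n))) < tol for j in range(n))

def is_reversible(pi, q, tol=1e-12):
    """Detailed balance: pi[i] * Q[i][j] == pi[j] * Q[j][i] for all i, j."""
    n = len(pi)
    return all(abs(pi[i] * q[i][j] - pi[j] * q[j][i]) < tol
               for i in range(n) for j in range(n))

def add_cyclic_flux(pi, q, c):
    """Add a probability flux c around the cycle 0 -> 1 -> 2 -> 3 -> 0 and
    subtract it in the opposite direction. Row sums and pi * Q are unchanged,
    but detailed balance fails whenever c != 0."""
    n = len(pi)
    out = [row[:] for row in q]
    for i in range(n):
        j = (i + 1) % n
        out[i][j] += c / pi[i]
        out[j][i] -= c / pi[j]
    return out

# Hypothetical base frequencies (A, C, G, T) and symmetric exchangeabilities.
pi = [0.1, 0.2, 0.3, 0.4]
s = [[0, 1, 2, 1],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [1, 2, 1, 0]]

q_rev = gtr_rate_matrix(pi, s)
q_nonrev = add_cyclic_flux(pi, q_rev, 0.01)  # c small enough to keep rates >= 0

print(is_stationary(pi, q_rev), is_reversible(pi, q_rev))        # True True
print(is_stationary(pi, q_nonrev), is_reversible(pi, q_nonrev))  # True False
```

The second matrix shows that stationarity does not imply reversibility, which is why the two assumptions can be tested separately within a set of nested models.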
Related resources
Empar: EM-based algorithm for parameter estimation of Markov models on trees
The goal of branch length estimation in phylogenetic inference is to estimate the divergence time between a set of sequences based on compositional differences between them. Several software packages are currently available for estimating branch lengths under homogeneous and stationary evolutionary models. Homogeneity of the evolutionary process imposes fixed rates of evolution throughout the ...
On the use of nucleic acid sequences to infer early branchings in the tree of life.
Simplifying assumptions made in various tree reconstruction methods--notably rate constancy among nucleotide sites, homogeneity, and stationarity of the substitutional processes--are clearly violated when nucleotide sequences are used to infer distant relationships. Use of tree reconstruction methods based on such oversimplified assumptions can lead to misleading results, as pointed out by prev...
A stochastic analysis of three viral sequences.
This paper analyzes the nucleotide sequences of three viruses: Kunjin, West Nile, and yellow fever. Each virus has one long open reading frame of greater than 10,200 nucleotides that codes for four structural and seven nonstructural genes. The Kunjin and West Nile viruses are the most closely related pair, when assessed on the basis of matches between their nucleotide sequences. As would be exp...
A Nonhomogeneous Hidden Markov Model for Precipitation
A stochastic model for relating precipitation occurrences at multiple rain gauge stations to broad-scale atmospheric circulation patterns (the so-called "downscaling problem") is proposed. The model is an example of a nonhomogeneous hidden Markov model and generalizes existing downscaling models in the literature. The model assumes that atmospheric circulation can be classified into a small numb...
On $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov processes
In the present paper we investigate the $L_1$-weak ergodicity of nonhomogeneous continuous-time Markov processes with general state spaces. We provide a necessary and sufficient condition for such processes to satisfy the $L_1$-weak ergodicity. Moreover, we apply the obtained results to establish $L_1$-weak ergodicity of quadratic stochastic processes.
Journal title:
- Systematic Biology
Volume 60, Issue 1
Pages: -
Publication date: 2011